Education and debate 


Effect of interpretive bias on research evidence 

Ted J Kaptchuk 

Doctors are being encouraged to improve their critical appraisal skills to make better use of medical 
research. But when using these skills, it is important to remember that interpretation of data is 
inevitably subjective and can itself result in bias. 


Facts do not accumulate on the blank slates of 
researchers’ minds and data simply do not speak for 
themselves. 1 Good science inevitably embodies a 
tension between the empiricism of concrete data and 
the rationalism of deeply held convictions. Unbiased 
interpretation of data is as important as performing 
rigorous experiments. This evaluative process is never 
totally objective or completely independent of 
scientists’ convictions or theoretical apparatus. This 
article elaborates on an insight of Vandenbroucke, who 
noted that “facts and theories remain inextricably 
linked ... At the cutting edge of scientific progress, 
where new ideas develop, we will never escape
subjectivity.” 2 Interpretation can produce sound judgments or
systematic error. Only hindsight will enable us to tell 
which has occurred. Nevertheless, awareness of the
systematic errors that can occur in evaluative processes
may facilitate the self regulating forces of science and 
help produce reliable knowledge sooner rather than 
later. 

Interpretative processes and biases in 
medical science 

Science demands a critical attitude, but it is difficult to 
know whether you have allowed for too much or too 
little scepticism. Also, where is the demarcation
between the background necessary for making
judgments (such as theoretical commitments and
previous knowledge) and the scientific goal of being
objective and free of preconceptions? The interaction 
between data and judgment is often ignored because 
there is no objective measure for the subjective 
components of interpretation. Taxonomies of bias
usually emphasise technical problems that can be fixed. 3
The biases discussed below, however, may be present in 
the most rigorous science and are obvious only in 
retrospect. 

Quality assessment and confirmation bias

The quality of any experimental findings must be 
appraised. Was the experiment well performed and are 
the outcomes reliable enough for acceptance? This 
scrutiny, however, may cause a confirmation bias: 
researchers may evaluate evidence that supports their 
prior belief differently from that apparently
challenging these convictions. Despite the best intentions,
everyday experience and social science research
indicate that higher standards may be expected of
evidence contradicting initial expectations.

Two examples might be helpful. Koehler asked 
297 advanced university science graduate students to 
evaluate two supposedly genuine experiments after 
being induced with different “doses” of positive and 
negative beliefs through false background papers. 4 
Questionnaires showed that their beliefs were
successfully manipulated. The students gave significantly
higher ratings to reports that agreed with their
manipulated beliefs, and the effect was greater among 
those induced to hold stronger beliefs. In another 
experiment, 398 researchers who had previously 
reviewed experiments for a respected journal were 
unknowingly randomly assigned to assess fictitious 
reports of treatment for obesity. The reports were 
identical except for the description of the intervention 
being tested. One intervention was an unproved but 
credible treatment (hydroxycitrate); the other was an 
implausible treatment (homoeopathic sulphur).
Quality assessments were significantly higher for the more
plausible version. 5 Such confirmation bias may be
common. w1 w2
Definitions of interpretation biases

Confirmation bias: evaluating evidence that supports
one’s preconceptions differently from evidence that
challenges these convictions

Rescue bias: discounting data by finding selective faults
in the experiment

Auxiliary hypothesis bias: introducing ad hoc
modifications to imply that an unanticipated finding
would have been otherwise had the experimental
conditions been different

Mechanism bias: being less sceptical when underlying
science furnishes credibility for the data

“Time will tell” bias: the phenomenon that different
scientists need different amounts of confirmatory
evidence

Orientation bias: the possibility that the hypothesis
itself introduces prejudices and errors and becomes a
determinant of experimental outcomes


Expectation and rescue and auxiliary 
hypothesis biases 

Experimental findings are inevitably judged by
expectations, and it is reasonable to be suspicious of evidence
that is inconsistent with apparently well confirmed 
principles. Thus an unexpected result is initially apt to 
be considered an indication that the experiment was 
poorly designed or executed. 6 w3 This process of
interpretation, so necessary in science, can give rise to 
rescue bias, which discounts data by selectively finding 
faults in the experiment. Although confirmation bias is 
usually unintended, rescue bias is a deliberate attempt 
to evade evidence that contradicts expectation. 

Instances of rescue bias are almost as numerous as 
letters to the editors in journals. The avalanche of
letters in response to the Veterans Administration
Cooperative randomised controlled trial examining 
the efficacy of coronary artery bypass grafting 
published in 1977 is a well documented example. 7 The 
trial found no significant difference in mortality 
between 310 patients treated medically and 286 
treated surgically. A subgroup of 113 patients with 
obstruction of the left main coronary artery, however, 
clearly benefited from surgery. 8 Instead of settling the 
clinical question, the trial spurred fierce debate in 
which supporters and detractors of the surgery 
perceived flaws that, they claimed, would skew the
evidence away from their preconceived position. Each
stakeholder found selective faults to justify pre-existing
positions that reflected their disciplinary
affiliations (cardiology v cardiac surgery), traditions of
research (clinical v physiological), and personal 
experience. 9 

Auxiliary hypothesis bias is a form of rescue bias. 
Instead of discarding contradictory evidence by seeing 
fault in the experiment, the auxiliary hypothesis
introduces ad hoc modifications to imply that an
unexpected finding would have been otherwise had 
the experimental conditions been different. Because 
experimental conditions can easily be altered in so 
many ways, adjusting a hypothesis is a versatile tool for 
saving a cherished theory. w4 Evidence pointing to an
unwelcome finding in a randomised controlled trial, 
for example, can easily be dismissed by arguments 
against the therapeutic dose, its timing, or how patients
were selected. Lakatos termed such reluctance to
accept an experimental verdict a scientist’s “thick 
skin.” 10 Thus, when early randomised controlled trials 
showed that hormone replacement therapy did not 
reduce the risk of coronary heart disease, 11 advocates 
of hormone replacement therapy argued that it was 
still valuable for primary prevention because the study 
group was women with established coronary heart
disease, making the disease too far advanced to benefit
from the treatment. 

Plausibility and mechanism bias 

Evidence is more easily accepted when supported by 
accepted scientific mechanisms. This understandable 
tendency to be less sceptical when underlying science 
furnishes credibility can give rise to mechanism bias. 
Often, such scientific plausibility underlies and overlaps 
the other biases I’ve described. Many examples exist 
where with hindsight it is clear that plausibility caused 
systematic misinterpretation of evidence. For example, 
the early negative evidence for hormone replacement 
therapy would have undoubtedly been judged less
cautiously if a biological rationale had not already created a
strong expectation that oestrogens would benefit the 
cardiovascular system. 12 w5 Similarly, the rationale for
antiarrhythmic drugs for myocardial infarction was so 
imbedded that each of three antiarrhythmic drugs had 
to be proved harmful individually before each trial 
could be terminated. 13 w6 And the link between
Helicobacter pylori and peptic ulcer was rejected initially
because the stomach was considered to be too acidic to 
support bacterial growth. 14 

Waiting for more evidence and “time will 
tell” bias 

The position that more evidence is necessary before 
making a judgment indicates a judicious attitude that is 
central to a scientific scepticism. None the less, 
different scientists seem to need different amounts of 
confirmatory evidence to feel satisfied. This
discrepancy in duration conceals a subjective process that
easily can become a “time will tell” bias. The evangelist, 
at one extreme, is quick to accept the data as good
evidence (or even proof). Evangelists often have a vested
intellectual, professional, or personal commitment and
may have taken part in the experiment being assessed.
At the other extreme are the snails, who invariably find 
the data unconvincing, perhaps because of their 
personal and intellectual investment in old “facts.” At 
the two extremes, as well as at all points in between, 
there is no objective way to tell whether good 
judgment or systematic error is operating. Max Planck 
described the “time will tell” bias cynically: “a new 
scientific truth does not triumph by convincing its 
opponents and making them see the light, but rather 
because its opponents eventually die, and a new 
generation grows up that is familiar with it.” 15 

Hypothesis and orientation bias 

The above categories of potential biases all occur after 
data are collected. Sometimes, however, conviction may 
affect the collection of data, creating orientation bias. 
Psychologists call this the “experimenter’s hypothesis as
an unintended determinant of experimental results.” 16
Thus, psychology graduate students, when informed
that rats were specially bred for maze brightness, found
that these rats outperformed those bred for maze
dullness, despite both groups really being standard
laboratory rats assigned at random. 17 Somehow,
experimental and recording errors tend to be larger and more in
the direction supporting the hypothesis. w7 w8

Summary points

Evidence does not speak for itself and must be
interpreted for quality and likelihood of error

Interpretation is never completely independent of
a scientist’s beliefs, preconceptions, or theoretical
commitments

On the cutting edge of science, scientific
interpretation can lead to sound judgment or
interpretative biases; the distinction can often be
made only in retrospect

Common interpretative biases include
confirmation bias, rescue bias, auxiliary
hypothesis bias, mechanism bias, “time will tell”
bias, and orientation bias

The interpretative process is a necessary aspect of
science and represents an ignored subjective and
human component of rigorous medical inquiry

Numerous studies have noted that randomised 
controlled trials sponsored by the pharmaceutical 
industry consistently favour new therapies. 18 Research
outcomes seem to be affected by what the researcher is 
looking for. It is unclear to what extent these apparent 
successes are the result of publication bias or matters of 
study design. Nonetheless, such results are consistent 
with an orientation bias and explain the fact that some
early double blind randomised controlled trials
performed by enthusiasts show efficacy (like hyperbaric
oxygen for multiple sclerosis 19 w9 or endotoxin
antibodies for Gram negative septic shock 20), whereas
subsequent trials cannot replicate the outcome. 19

Comments 

This article is written from the perspective of philosophy
of science. From a statistical point of view, the arguments
presented are obviously compatible with a subjectivist or
bayesian framework that formally incorporates previous 
beliefs in calculations of probability. But even if we 
accept that probabilities measure objective frequencies 
of events, the arguments still apply. After all, the overall 
experiment still has to be assessed. 
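
As a rough illustration of how a bayesian framework formally incorporates previous beliefs, the sketch below updates two different prior probabilities with the same evidence. The function name, the prior values, and the likelihood ratio are hypothetical numbers chosen only for the example; they are not taken from any trial discussed in this article.

```python
def posterior(prior: float, likelihood_ratio: float) -> float:
    """Update a prior probability using Bayes' rule on the odds scale.

    prior: belief (between 0 and 1) that the treatment works, held before the trial.
    likelihood_ratio: how much more probable the observed data are if the
        treatment works than if it does not (hypothetical value here).
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Two readers assess the same trial (same likelihood ratio) but start from
# different convictions. All numbers are illustrative assumptions.
EVIDENCE = 4.0
sceptic = posterior(0.10, EVIDENCE)
enthusiast = posterior(0.60, EVIDENCE)

print(f"sceptic:    {sceptic:.2f}")
print(f"enthusiast: {enthusiast:.2f}")
```

Under these assumed numbers the same data leave the two readers with quite different degrees of belief, which is one formal way of expressing the point that interpretation is not independent of prior conviction.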

I have argued that research data must necessarily 
undergo a tacit quality control system of scientific 
scepticism and judgment that is prone to bias. 
Nonetheless, I do not mean to reduce science to a naive 
relativism or argue that all claims to knowledge are to 
be judged equally valid because of potential
subjectivity in science. Recognition of an interpretative process
does not contradict the fact that the pressure of 
additional unambiguous evidence acts as a self regulating
mechanism that eventually corrects systematic
error. Ultimately, brute data are coercive. However, a 
view that science is totally objective is mythical, and 
ignores the human element of medical inquiry.
Awareness of subjectivity will make assessment of evidence
more honest, rational, and reasonable. 21 

This article is a shortened version of a paper written for a
seminar on bias led by Frederick Mosteller at Harvard University and
reflects his helpful feedback. Peter Goldman criticised earlier
versions of the article and helped make it understandable. The
comments of Iain Chalmers and Al Fishman have been helpful,
as was the dedicated research of Cleo Youtz. All errors and
shortcomings of the paper belong solely to the author.
Endpiece

In praise of neurotics 

All the greatest things we know have come to us 
from neurotics. It is they and they only who have 
founded religions and created great works of art.
Never will the world be conscious of how much it 
owes to them, nor above all what they have suffered 
in order to bestow their gifts on it. 

Marcel Proust, Guermantes Way, 1921 